Abstract. In this paper, we address strongly convex programming for principal
component pursuit with reduced linear measurements, which decomposes a
superposition of a low-rank matrix and a sparse matrix from a small set of
linear measurements. We first provide sufficient conditions under which the
strongly convex models lead to exact low-rank and sparse matrix recovery;
second, we give suggestions on how to choose suitable parameters in
practical algorithms.



1. Introduction. Recently, much attention has been drawn to the problem of
recovering a target matrix from a small set of linear measurements. The target
matrix is a superposition of low-complexity structures. This problem arises in
many different fields, such as medical imaging [1] [2] [3], seismology [4],
information retrieval [5] and machine learning [6].

This problem regained great attention after the publication of the pioneering
works of E.J. Candès et al. [7] [8] [9] [10]. According to paper [16], we can
build the data model as follows: there exists a large-scale data matrix
$M = L_0 + S_0$, where $L_0 \in \mathbb{R}^{n \times n}$ has low rank and $S_0$
is a sparse component. The main question is how to recover the low-rank matrix
$L_0$ and the sparse matrix $S_0$ from a small set of linear measurements. In
the paper [16], E.J. Candès et al. proved that most low-rank matrices and
sparse components can be recovered, provided that the rank of the low-rank
component is not too large and the sparse component is reasonably sparse; more
importantly, they proved that this can be done by solving a simple convex
optimization problem, i.e., most low-rank matrices and sparse components can be
perfectly recovered by solving the optimization problem
$$
\begin{array}{ll}
\text{minimize} & \|L\|_* + \lambda \|S\|_1 \\
\text{subject to} & L + S = M
\end{array}
\qquad (1)
$$
provided that the rank of the matrix $L_0$ and the cardinality $m$ of the
support of the sparse component $S_0$ obey

$$
\operatorname{rank}(L_0) \le \rho_r n \mu^{-1} (\log n)^{-2}
\quad \text{and} \quad m \le \rho_s n^2,
$$

where $\rho_r$ and $\rho_s$ are positive numerical constants.
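As a small illustration of the data model $M = L_0 + S_0$, the following sketch generates a synthetic instance. The function name `make_low_rank_plus_sparse`, the Gaussian factors of the low-rank part, and the $\pm 1$ values of the sparse entries are assumptions made only for illustration; they are not prescribed by the model itself.

```python
import numpy as np

def make_low_rank_plus_sparse(n, rank, rho, seed=0):
    """Return (M, L0, S0) with M = L0 + S0, rank(L0) = rank, supp(S0) ~ Ber(rho)."""
    rng = np.random.default_rng(seed)
    # Low-rank part: product of two thin Gaussian factors (illustrative choice).
    L0 = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))
    # Sparse part: each entry is nonzero independently with probability rho,
    # and nonzero entries carry a random sign (illustrative magnitudes).
    support = rng.random((n, n)) < rho
    signs = rng.choice([-1.0, 1.0], size=(n, n))
    S0 = support * signs
    return L0 + S0, L0, S0

M, L0, S0 = make_low_rank_plus_sparse(n=50, rank=3, rho=0.05)
```

Only $M$ would be observed in practice; $L_0$ and $S_0$ are returned here so that a recovery result can be compared against the ground truth.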

In practice, it is necessary to develop efficient and effective tools to
process, analyze, and extract useful information from such high-dimensional
data (in applications, the dimension of the data is always very high). Because
strongly convex optimization problems have many advantages, such as a unique
optimal solution, many scholars suggest solving strongly convex approximations,
see, e.g., [11] [12] [13] [14], instead of directly solving the original convex
problems. Pertaining to problem (1), the authors of [14] gave sufficient
conditions under which the strongly convex models lead to exact low-rank and
sparse matrix recovery, together with suggestions on how to choose suitable
parameters in practical algorithms. However, the results of paper [14] are
limited to a special case, i.e., $Q = \mathbb{R}^{n \times n}$. In this paper,
we extend this result to principal component pursuit with reduced linear
measurements, i.e., $Q^\perp$ is a $p$-dimensional random subspace. It is easy
to see that the result of paper [14] is a special case of ours.



1.1. Basic problem formulations. In this subsection, we introduce the strongly
convex program addressed in this paper and list the corresponding existence and
uniqueness theorems. In the paper [15], the authors studied principal component
pursuit with reduced linear measurements,

$$
\begin{array}{ll}
\text{minimize} & \|L\|_* + \lambda \|S\|_1 \\
\text{subject to} & \mathcal{P}_Q M = \mathcal{P}_Q (L + S),
\end{array}
$$

and gave sufficient conditions under which $L_0$ and $S_0$ can be perfectly
recovered.

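In this paper, we address a strongly convex program and prove that it
guarantees exact low-rank matrix recovery. The proposed optimization problem is

$$
\begin{array}{ll}
\text{minimize} & \|L\|_* + \lambda \|S\|_1 + \frac{1}{2\tau}\|L\|_F^2 + \frac{1}{2\tau}\|S\|_F^2 \\
\text{subject to} & \mathcal{P}_Q M = \mathcal{P}_Q (L + S)
\end{array}
\qquad (2)
$$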

where $\tau > 0$ is a positive penalty parameter and $\mathcal{P}_Q$ is the
orthogonal projection onto the linear subspace $Q$. We also assume that
$Q^\perp$ is a random subspace (the same assumption as in paper [15]). When
$\tau = \infty$ in (2), existence and uniqueness theorems are provided in the
paper [15]; we list them below. Finally, we discuss how to choose suitable
parameters in the optimization model (2).
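To make the objective of (2) concrete, the following minimal sketch evaluates it for given $L$, $S$, $\lambda$ and $\tau$, assuming the objective reads $\|L\|_* + \lambda\|S\|_1 + \frac{1}{2\tau}\|L\|_F^2 + \frac{1}{2\tau}\|S\|_F^2$. The function name is hypothetical, and the constraint $\mathcal{P}_Q M = \mathcal{P}_Q(L + S)$ is not enforced here; only the objective value is computed.

```python
import numpy as np

def strongly_convex_objective(L, S, lam, tau):
    """Objective of (2): nuclear norm + lam * l1 norm + quadratic terms / (2 tau)."""
    nuclear = np.linalg.norm(L, ord="nuc")        # sum of singular values of L
    l1 = np.abs(S).sum()                          # entrywise l1 norm of S
    quadratic = (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(S, "fro") ** 2) / (2.0 * tau)
    return nuclear + lam * l1 + quadratic

# Tiny example with values that can be checked by hand.
L_example = np.diag([3.0, 4.0])
S_example = np.array([[1.0, -2.0], [0.0, 0.0]])
value = strongly_convex_objective(L_example, S_example, lam=0.5, tau=5.0)
value_without_penalty = strongly_convex_objective(L_example, S_example, lam=0.5, tau=float("inf"))
```

Passing `tau = float("inf")` removes the quadratic terms, which corresponds to the convex model without the strongly convex regularization.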

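Theorem 1.1. [15] Fix any $C_p > 0$, and let $Q^\perp$ be a $p$-dimensional
random subspace of $\mathbb{R}^{n \times n}$; suppose that $L_0$ obeys the
incoherence condition with parameter $\mu$, and that
$\operatorname{supp}(S_0) \sim \operatorname{Ber}(\rho)$. Then, with high
probability, the solution $(\hat{L}, \hat{S})$ of problem (2) with
$\tau = \infty$ and $\lambda$ chosen as in [15] is exact, i.e.,
$\hat{L} = L_0$ and $\hat{S} = S_0$, provided that

$$
\operatorname{rank}(L_0) < C_r n \mu^{-1} (\log n)^{-2}, \quad p < C_p n
\quad \text{and} \quad \rho < \rho_0, \qquad (3)
$$

where $C_r$, $C_p$ and $\rho_0$ are positive numerical constants and $\rho_0 < 1$.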






1.2. Contents and Notations. We provide a brief summary of the notations
which are used throughout the paper. We denote the operator norm of a matrix
$A$ by $\|A\|$, the Frobenius norm by $\|A\|_F$, the nuclear norm by
$\|A\|_*$, and the dual norm of $\|X\|_{(i)}$ by $\|X\|_{(i)}^*$. The Euclidean
inner product between two matrices is defined by the formula
$\langle X, Y \rangle = \operatorname{trace}(X^* Y)$. Note that
$\|X\|_F^2 = \langle X, X \rangle$. The Cauchy-Schwarz inequality gives
$\langle X, Y \rangle \le \|X\|_F \|Y\|_F$, and it is well known that we also
have $\langle X, Y \rangle \le \|X\|_{(i)} \|Y\|_{(i)}^*$, see, e.g., [5] [17].
Linear transformations which act on the space of matrices are denoted by
$\mathcal{P}X$. Such an operator $\mathcal{P}$ is in essence a high-dimensional
matrix. The operator norm of the operator is denoted by $\|\mathcal{P}\|$. It
should be noted that
$\|\mathcal{P}\| = \sup_{\|X\|_F = 1} \|\mathcal{P}X\|_F$.
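The norms and inequalities above can be checked numerically. The sketch below uses real matrices and instantiates the dual pair as the operator norm and the nuclear norm, which are dual to each other; the variable names and the support-restriction operator used to illustrate $\|\mathcal{P}\|$ are assumptions chosen for illustration.

```python
import numpy as np

def inner(X, Y):
    """Euclidean inner product <X, Y> = trace(X^* Y) for real matrices."""
    return float(np.trace(X.T @ Y))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
Y = rng.standard_normal((5, 5))

op_X = np.linalg.norm(X, 2)        # operator (spectral) norm ||X||
fro_X = np.linalg.norm(X, "fro")   # Frobenius norm ||X||_F
fro_Y = np.linalg.norm(Y, "fro")
nuc_Y = np.linalg.norm(Y, "nuc")   # nuclear norm ||Y||_*, dual to the operator norm

# A linear operator on matrices is a matrix acting on vec(X). Here P keeps the
# entries on a fixed support (an orthogonal projection), so ||P|| should be 1.
mask = rng.random((5, 5)) < 0.3
mask[0, 0] = True                  # make sure the support is non-empty
P = np.diag(mask.ravel().astype(float))
op_P = np.linalg.norm(P, 2)        # sup of ||P X||_F over ||X||_F = 1
```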

The rest of the paper is organized as follows. In Section 2, we list several
important lemmas and prove a key lemma on which our main result depends.
Suggestions that guide the choice of suitable parameters in practical
algorithms are then given in Section 3. Conclusions and further work are
discussed in Section 4.



2. Important Lemmas. In this section, we first list some useful lemmas which
will be used throughout this paper and then prove a main lemma. Although the
main lemma is similar to the corresponding one in the paper [15], the
construction of $W^Q$ is different, which requires additional work.

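Lemma 2.1 ([15], Lemma 1). Suppose that
$\dim(Q^\perp \oplus T \oplus \Omega) = \dim(Q^\perp) + \dim(T) + \dim(\Omega)$.
Let $\Gamma = Q \cap T^\perp$ so that $\Gamma^\perp = Q^\perp \oplus T$. Assume
that $\|\mathcal{P}_\Omega \mathcal{P}_{\Gamma^\perp}\| \le 1/2$ and
$\lambda < 1$. Then, $(L_0, S_0)$ is the unique optimal solution to (2) if
there exists a pair $(W, F) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times n}$
satisfying

$$
UV^* + W = \lambda(\operatorname{sgn}(S_0) + F + \mathcal{P}_\Omega D) \in Q
$$

with $\mathcal{P}_T W = 0$, $\|W\| < 1/2$, $\mathcal{P}_\Omega F = 0$,
$\|F\|_\infty < 1/2$, and $\|\mathcal{P}_\Omega D\|_F < 1/4$.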

Lemma 2.2 ([15], Lemma 3). Assume that $\Omega \sim \operatorname{Ber}(\rho)$
for some small $\rho \in (0, 1)$ and that the other conditions of Theorem 1.1
hold true. Then, the matrix $W^L$ obeys, with high probability,

(a). $\|W^L\| < 1/4$

(b). $\|\mathcal{P}_\Omega(UV^* + W^L)\|_F < \lambda/4$

(c). $\|\mathcal{P}_{\Omega^\perp}(UV^* + W^L)\|_\infty < \lambda/4$

Lemma 2.3 ([15], Lemma 4). In addition to the assumptions in the previous
lemma, assume that the signs of the non-zero entries of $S_0$ are i.i.d.
random. Then, the matrix $W^S$ obeys, with high probability,

(a). $\|W^S\| < 1/8$

(b). $\|\mathcal{P}_{\Omega^\perp} W^S\|_\infty < \lambda/8$

The construction of $W^L$ and $W^S$ can be found in the paper [15]. The authors
also introduce a new scheme to construct $W^Q$ for the principal component
pursuit. However, the matrix $W^Q$ constructed in the paper [15] does not
satisfy the requirements of our problem, so we have to modify this
construction. We first give an explicit construction of $W^Q$, and then prove
that the modified $W^Q$ satisfies the corresponding property.

Construction of $W^Q$ with least modification. We define $W^Q$ by the following
least squares problem:

$$
\begin{array}{ll}
W^Q = \arg\min_X & \|X\|_F \\
\text{subject to} & \mathcal{P}_{Q^\perp} X = -\mathcal{P}_{Q^\perp}\left(UV^* + \frac{1}{\tau} L_0\right) \\
& \mathcal{P}_\Pi X = 0
\end{array}
$$

where $\Pi = T \oplus \Omega$. This construction of $W^Q$ not only satisfies
Theorem 2.6, but also obeys the lemma below.

Lemma 2.4. Assume that $\tau \ge \|M\|_F$, that
$\Omega \sim \operatorname{Ber}(\rho)$ for some small $\rho \in (0, 1)$, and
that the assumptions of Theorem 1.1 hold true. Then, the matrix $W^Q$ obeys,
with high probability,

(a). $\|W^Q\| < 1/8$

(b). $\|\mathcal{P}_{\Omega^\perp} W^Q\|_\infty < \lambda/8$
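The least squares construction of $W^Q$ can be sketched numerically on a tiny instance. The code below assumes the constraints read $\mathcal{P}_{Q^\perp} X = -\mathcal{P}_{Q^\perp}(UV^* + \frac{1}{\tau}L_0)$ and $\mathcal{P}_\Pi X = 0$ with $\Pi = T \oplus \Omega$, and uses the standard tangent-space projection $\mathcal{P}_T X = UU^*X + XVV^* - UU^*XVV^*$. The helper names, the problem sizes, the fixed support, and the use of `numpy.linalg.lstsq` to obtain the minimum-norm solution are all illustrative assumptions; this is not the construction used in the proofs, only a numerical check of feasibility.

```python
import numpy as np

def operator_matrix(op, n):
    """Matrix of a linear map on n x n matrices, acting on row-major vec(X)."""
    basis = np.eye(n * n)
    return np.column_stack([op(e.reshape(n, n)).ravel() for e in basis])

def construct_WQ(U, V, L0, support, H, tau):
    """Minimum-Frobenius-norm X with P_{Q^perp} X = target, P_T X = 0, P_Omega X = 0."""
    n = U.shape[0]
    PU, PV = U @ U.T, V @ V.T
    # Tangent-space projection: P_T X = P_U X + X P_V - P_U X P_V.
    P_T = operator_matrix(lambda X: PU @ X + X @ PV - PU @ X @ PV, n)
    # P_Omega keeps the entries on the support of the sparse component.
    P_Om = np.diag(support.ravel().astype(float))
    # P_{Q^perp} = H (H^* H)^{-1} H^*, where range(H) plays the role of Q^perp.
    P_Qp = H @ np.linalg.solve(H.T @ H, H.T)
    target = -P_Qp @ (U @ V.T + L0 / tau).ravel()
    # P_Pi X = 0 with Pi = T (+) Omega is imposed as P_T X = 0 and P_Omega X = 0.
    A = np.vstack([P_Qp, P_T, P_Om])
    b = np.concatenate([target, np.zeros(2 * n * n)])
    # lstsq returns the minimum-norm solution of this rank-deficient system.
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    return x.reshape(n, n), (P_Qp, P_T, P_Om, target)

rng = np.random.default_rng(0)
n, r, p = 6, 1, 2
L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
U_full, _, Vt_full = np.linalg.svd(L0)
U, V = U_full[:, :r], Vt_full[:r].T
support = np.zeros((n, n), dtype=bool)
support[0, 1] = support[2, 4] = support[5, 3] = True   # hypothetical support of S0
S0 = support * 1.0
H = rng.standard_normal((n * n, p)) / n                # i.i.d. entries ~ N(0, 1/n^2)
tau = np.linalg.norm(L0 + S0, "fro")                   # tau = ||M||_F
WQ, (P_Qp, P_T, P_Om, target) = construct_WQ(U, V, L0, support, H, tau)
```

On such a small instance one can then inspect `np.linalg.norm(WQ, 2)` and the largest off-support entry of `WQ`; the bounds of Lemma 2.4 are high-probability statements for large $n$, so they need not hold at this size.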

In the proof of Lemma 2.4, we use two important lemmas, which are listed
below.

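Lemma 2.5 ([15], Lemma 11). Let $S_1$, $S_2$ and $S_3$ be any three linear
subspaces in $\mathbb{R}^{n \times n}$ satisfying
$\dim(S_1 \oplus S_2 \oplus S_3) = \dim(S_1) + \dim(S_2) + \dim(S_3)$, and
$\|\mathcal{P}_{S_1}\mathcal{P}_{S_2}\| \le \alpha_{1,2} < 1$,
$\|\mathcal{P}_{S_2}\mathcal{P}_{S_3}\| \le \alpha_{2,3} < 1$ and
$\|\mathcal{P}_{S_3}\mathcal{P}_{S_1}\| \le \alpha_{3,1} < 1$. We define
$S = S_1 \oplus S_2$. Then, we have

$$
\|\mathcal{P}_S \mathcal{P}_{S_3}\| \le \frac{\alpha_{2,3} + \alpha_{3,1}}{1 - \alpha_{1,2}}.
$$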



Lemma 2.6 ([15], Lemma 7). Assume that $p < n^2/4$. Let $Q$ be a linear
subspace distributed according to the random subspace model. Then, with high
probability, we have

V Q ±V T < 8^—^ 

11 

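Proof. A. Bounding the behavior of $\|UV^* + \frac{1}{\tau}L_0\|_F$. For
convenience, let $\xi := \|UV^* + \frac{1}{\tau}L_0\|_F$.

According to the triangle inequality, we have

$$
\|L_0\|_F = \|M - S_0\|_F \le \|M\|_F + \|S_0\|_F = \|M\|_F + \|\mathcal{P}_\Omega S_0\|_F.
$$

In the last equality, we have used $S_0 \in \Omega$. Note that

$$
\|\mathcal{P}_\Omega S_0\|_F = \|\mathcal{P}_\Omega(M - L_0)\|_F \le \|\mathcal{P}_\Omega M\|_F + \|\mathcal{P}_\Omega L_0\|_F.
$$

According to the derivation in the paper [14], with high probability, we can
obtain

$$
\|\mathcal{P}_\Omega L_0\|_F \le \frac{\sqrt{3}}{3}\|\mathcal{P}_{\Omega^\perp} M\|_F \le \frac{\sqrt{3}}{3}\|M\|_F.
$$

Putting those all together, we get

$$
\|L_0\|_F \le (\sqrt{3}/3 + 2)\|M\|_F.
$$

Combining with $\tau \ge \|M\|_F$, we can obtain

$$
\xi \le \|UV^*\|_F + \frac{\|L_0\|_F}{\tau} \le \sqrt{r} + \frac{(\sqrt{3}/3 + 2)\|M\|_F}{\tau} \le \sqrt{r} + \sqrt{3}/3 + 2.
$$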
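Because $W^Q$ is the optimal solution of the least squares problem, we can use
the convergent Neumann series expansion. It is easy to see that

$$
W^Q = \mathcal{P}_{\Pi^\perp} \sum_{k \ge 0} (\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k \left(\mathcal{P}_{Q^\perp}\left(-UV^* - \frac{1}{\tau}L_0\right)\right).
$$

By the definition of the operator norm, and since
$\|\mathcal{P}_{\Pi^\perp}\| \le 1$, we have

$$
\|W^Q\|_F \le \left\|\sum_{k \ge 0} (\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k\right\| \left\|\mathcal{P}_{Q^\perp}\left(-UV^* - \frac{1}{\tau}L_0\right)\right\|_F \qquad (4)
$$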

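B. Estimating the first inequality of Lemma 2.4. In order to bound
$\|W^Q\|_F$, we have to bound the behavior of
$\|\sum_{k \ge 0} (\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k\|$.
We have

$$
\left\|\sum_{k \ge 0} (\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k\right\|
\le \sum_{k \ge 0} \left\|(\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k\right\|
\le \sum_{k \ge 0} \|\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp}\|^k
= \sum_{k \ge 0} \|\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\|^{2k}.
$$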

According to Lemma 2.5, we have, for any $\epsilon > 0$, with high probability,



1 — \fp~- Ve \ \ \ n 2 V 4 I \\n 2 '\ln 
According to the paper [15], we have

ll£0V7WV)*||<| 

k>0 

with high probability. 

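Next, we will bound
$\|\mathcal{P}_{Q^\perp}(-UV^* - \frac{1}{\tau}L_0)\|_F$. According to the
paper [15], $\mathcal{P}_{Q^\perp}$ has the same distribution as
$H(H^*H)^{-1}H^*$, where $H \in \mathbb{R}^{n^2 \times p}$ is a random Gaussian
matrix with i.i.d. entries $\sim \mathcal{N}(0, 1/n^2)$. Therefore, we can
obtain

$$
\begin{aligned}
\left\|\mathcal{P}_{Q^\perp}\left(UV^* + \frac{1}{\tau}L_0\right)\right\|_F
&= \left\|H(H^*H)^{-1}H^* \operatorname{vec}\left(UV^* + \frac{1}{\tau}L_0\right)\right\|_2 \\
&\le \|H(H^*H)^{-1}\| \left\|H^* \operatorname{vec}\left(UV^* + \frac{1}{\tau}L_0\right)\right\|_2.
\end{aligned}
$$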
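The representation $\mathcal{P}_{Q^\perp} = H(H^*H)^{-1}H^*$ and the norm bound above can be illustrated numerically. In the sketch below, the function name, the sizes $n = 8$, $p = 5$, and the random vector `x` standing in for $\operatorname{vec}(UV^* + \frac{1}{\tau}L_0)$ are assumptions chosen only for illustration.

```python
import numpy as np

def random_subspace_projector(n, p, seed=0):
    """Return (H, G, P) with G = H (H^* H)^{-1} and P = G H^*, the projector onto range(H)."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((n * n, p)) / n      # i.i.d. entries ~ N(0, 1/n^2)
    G = H @ np.linalg.inv(H.T @ H)
    return H, G, G @ H.T

n, p = 8, 5
H, G, P = random_subspace_projector(n, p)
x = np.random.default_rng(1).standard_normal(n * n)   # stands in for vec(UV^* + L0/tau)

lhs = np.linalg.norm(P @ x)                            # ||P_{Q^perp} x||_2
rhs = np.linalg.norm(G, 2) * np.linalg.norm(H.T @ x)   # ||H (H^* H)^{-1}|| * ||H^* x||_2
```

Here `P` is symmetric and idempotent with trace $p$, as expected of an orthogonal projection onto a $p$-dimensional subspace, and `lhs <= rhs` holds by submultiplicativity of the operator norm.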

Together with Lemma 2.6, we can obtain 

P[||F(F* J ff)" 1 || > 4] < e"^ 

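It is easy to see that every entry of
$H^* \operatorname{vec}(UV^* + \frac{1}{\tau}L_0)$ has the same distribution as
$\langle G, UV^* + \frac{1}{\tau}L_0 \rangle$, where the entries
$G_{ij} \sim \mathcal{N}(0, 1/n^2)$ are independent and identically
distributed. It is obvious that

$$
E\left\{\left\langle G, UV^* + \frac{1}{\tau}L_0 \right\rangle\right\} = \left\langle E\{G\}, UV^* + \frac{1}{\tau}L_0 \right\rangle = 0
$$

and

$$
\operatorname{Var}\left\{\left\langle G, UV^* + \frac{1}{\tau}L_0 \right\rangle\right\}
= \sum_{ij} \left(UV^* + \frac{1}{\tau}L_0\right)_{ij}^2 \operatorname{Var}\{G_{ij}\}
= \xi^2/n^2.
$$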
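Therefore, $\langle G, UV^* + \frac{1}{\tau}L_0 \rangle$ is distributed
according to $\mathcal{N}(0, \xi^2/n^2)$, where
$\xi := \|UV^* + \frac{1}{\tau}L_0\|_F$. For simplicity, we define
$Z := H^* \operatorname{vec}(UV^* + \frac{1}{\tau}L_0)$. Since each of the $p$
entries of $Z$ has variance $\xi^2/n^2$, the Jensen inequality gives

$$
E[\|Z\|_2] \le \left(E[\|Z\|_2^2]\right)^{1/2} = \frac{\xi\sqrt{p}}{n}.
$$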
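The variance computation and the Jensen bound can be checked by a short Monte Carlo experiment, assuming $Z = H^*\operatorname{vec}(A)$ with $H \in \mathbb{R}^{n^2 \times p}$ having i.i.d. $\mathcal{N}(0, 1/n^2)$ entries, so that each entry of $Z$ has variance $\|A\|_F^2/n^2$ and $E\|Z\|_2 \le \|A\|_F\sqrt{p}/n$. The matrix `A` below merely stands in for $UV^* + \frac{1}{\tau}L_0$; its values, the sizes, and the sample count are illustrative assumptions.

```python
import numpy as np

def simulate_Z(A, p, num_samples=50_000, seed=0):
    """Draw samples of Z = H^* vec(A), with H of size n^2 x p and i.i.d. N(0, 1/n^2) entries."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((num_samples, n * n, p)) / n
    # For each sample k, Z_k = H_k^T vec(A).
    return np.einsum("kqp,q->kp", H, A.ravel())

n, p = 4, 3
A = np.arange(n * n, dtype=float).reshape(n, n) / 10.0   # stands in for UV^* + L0/tau
xi = np.linalg.norm(A, "fro")

Z = simulate_Z(A, p)
entry_variance = Z.var()                       # to be compared with xi^2 / n^2
mean_norm = np.linalg.norm(Z, axis=1).mean()   # to be compared with xi * sqrt(p) / n
```

The empirical variance should be close to `xi**2 / n**2`, and the empirical mean of $\|Z\|_2$ should stay below `xi * np.sqrt(p) / n`.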
According to Proposition 2.18 in [17], we can obtain



Zh > E l\\ z h\ +t 



Setting $t = \sqrt{6 \log n}$, after a simple calculation, we can obtain



3 \ n n 

with high probability. For sufficiently large n, the first inequality of Lemma 2.4 is 
established. 

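C. Estimating the second inequality of Lemma 2.4. Note that

$$
W^Q = \mathcal{P}_{\Pi^\perp} \sum_{k \ge 0} (\mathcal{P}_{Q^\perp}\mathcal{P}_\Pi\mathcal{P}_{Q^\perp})^k \left(\mathcal{P}_{Q^\perp}\left(-UV^* - \frac{1}{\tau}L_0\right)\right).
$$

Similar to the paper [15], after a simple calculation, we can obtain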



where C is some constant. Note that for sufficiently large n, the second inequality 
of Lemma 2.4 is established. □ 



3. Bounding parameter $\tau$. In this section, we provide sufficient conditions
under which $(L_0, S_0)$ is the unique solution of the strongly convex
programming (2) with high probability. Afterwards, an explicit lower bound on
$\tau$ is given, which will guide us in choosing suitable parameters in
practical algorithms.

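Theorem 3.1. Suppose that
$\dim(Q^\perp \oplus T \oplus \Omega) = \dim(Q^\perp) + \dim(T) + \dim(\Omega)$.
Let $\Gamma = Q \cap T^\perp$ so that $\Gamma^\perp = Q^\perp \oplus T$. Assume
that $\|\mathcal{P}_\Omega \mathcal{P}_{\Gamma^\perp}\| \le 1/2$ and
$\lambda < 1$. If there exists a pair
$(W, F) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times n}$ and a
matrix $D$ satisfying

$$
UV^* + W + \frac{1}{\tau}L_0 = \lambda(\operatorname{sgn}(S_0) + F + \mathcal{P}_\Omega D) + \frac{1}{\tau}S_0 \in Q
$$

with

$$
\mathcal{P}_T W = 0, \quad \|W\| \le \beta, \quad \mathcal{P}_\Omega F = 0, \quad \|F\|_\infty \le \beta, \quad \|\mathcal{P}_\Omega D\|_F \le \alpha, \qquad (5)
$$

where $\alpha$, $\beta$ are positive parameters satisfying

$$
\alpha + \beta \le 1, \qquad (6)
$$

then $(L_0, S_0)$ is the unique solution of the strongly convex programming (2).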
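Proof. For any feasible perturbation $(H_L, H_S)$, it is easy to see that
$\mathcal{P}_Q H_L = \mathcal{P}_Q H_S$. According to the definition of
$\Gamma$, we have $\Gamma \subset Q$, therefore
$\mathcal{P}_\Gamma H_L = \mathcal{P}_\Gamma H_S$. For simplicity, let
$f(L, S) = \|L\|_* + \lambda\|S\|_1 + \frac{1}{2\tau}\|L\|_F^2 + \frac{1}{2\tau}\|S\|_F^2$.
Choosing $W_0$ and $F_0$ such that $UV^* + W_0$ and
$\operatorname{sgn}(S_0) + F_0$ are subgradients of $\|\cdot\|_*$ at $L_0$ and
of $\|\cdot\|_1$ at $S_0$, with
$\langle W_0, H_L \rangle = \|\mathcal{P}_{T^\perp} H_L\|_*$ and
$-\langle F_0, H_S \rangle = \|\mathcal{P}_{\Omega^\perp} H_S\|_1$, we can
obtain

$$
\begin{aligned}
f(L_0 + H_L, S_0 - H_S)
&\ge f(L_0, S_0) + \left\langle UV^* + W_0 + \tfrac{1}{\tau}L_0, H_L \right\rangle - \left\langle \lambda\operatorname{sgn}(S_0) + \lambda F_0 + \tfrac{1}{\tau}S_0, H_S \right\rangle \\
&\ge f(L_0, S_0) + \langle W_0, H_L \rangle - \langle W, H_L \rangle + \left\langle UV^* + W + \tfrac{1}{\tau}L_0, \mathcal{P}_Q H_L \right\rangle \\
&\quad - \langle \lambda F_0, H_S \rangle + \langle \lambda F, H_S \rangle - \left\langle \lambda\operatorname{sgn}(S_0) + \lambda F + \tfrac{1}{\tau}S_0, \mathcal{P}_Q H_S \right\rangle \\
&\ge f(L_0, S_0) + \langle W_0, \mathcal{P}_{T^\perp} H_L \rangle - \langle W, \mathcal{P}_{T^\perp} H_L \rangle \\
&\quad - \langle \lambda F_0, \mathcal{P}_{\Omega^\perp} H_S \rangle + \langle \lambda F, \mathcal{P}_{\Omega^\perp} H_S \rangle - \langle \lambda \mathcal{P}_\Omega D, \mathcal{P}_Q H_S \rangle \\
&\ge f(L_0, S_0) + (1 - \beta)\|\mathcal{P}_{T^\perp} H_L\|_* + (1 - \beta)\lambda\|\mathcal{P}_{\Omega^\perp} H_S\|_1 - \alpha\lambda\|\mathcal{P}_\Omega H_S\|_F.
\end{aligned}
$$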

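In the second inequality above, we have used the fact that

$$
UV^* + W + \frac{1}{\tau}L_0 = \lambda(\operatorname{sgn}(S_0) + F + \mathcal{P}_\Omega D) + \frac{1}{\tau}S_0 \in Q.
$$

In the third inequality above, we have used
$\mathcal{P}_Q H_L = \mathcal{P}_Q H_S$.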

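We will bound $\|\mathcal{P}_\Omega H_S\|_F$. According to the definition of
$\Gamma$, we have

$$
\begin{aligned}
\|\mathcal{P}_\Omega H_S\|_F
&\le \|\mathcal{P}_\Omega \mathcal{P}_\Gamma H_S\|_F + \|\mathcal{P}_\Omega \mathcal{P}_{\Gamma^\perp} H_S\|_F \\
&\le \|\mathcal{P}_\Omega \mathcal{P}_\Gamma H_L\|_F + \tfrac{1}{2}\|H_S\|_F \\
&\le \|\mathcal{P}_\Gamma H_L\|_F + \tfrac{1}{2}\|\mathcal{P}_\Omega H_S\|_F + \tfrac{1}{2}\|\mathcal{P}_{\Omega^\perp} H_S\|_F \\
&\le \|\mathcal{P}_{T^\perp} H_L\|_F + \tfrac{1}{2}\|\mathcal{P}_\Omega H_S\|_F + \tfrac{1}{2}\|\mathcal{P}_{\Omega^\perp} H_S\|_F.
\end{aligned}
$$

Therefore

$$
\|\mathcal{P}_\Omega H_S\|_F \le 2\|\mathcal{P}_{T^\perp} H_L\|_F + \|\mathcal{P}_{\Omega^\perp} H_S\|_F \le 2\|\mathcal{P}_{T^\perp} H_L\|_* + \|\mathcal{P}_{\Omega^\perp} H_S\|_1.
$$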
Putting those all together, we get

$$
f(L_0 + H_L, S_0 - H_S) \ge f(L_0, S_0) + (1 - \beta - 2\alpha\lambda)\|\mathcal{P}_{T^\perp} H_L\|_* + (1 - \beta - \alpha)\lambda\|\mathcal{P}_{\Omega^\perp} H_S\|_1.
$$

This, together with (6), implies that $(L_0, S_0)$ is a solution to (2). The
uniqueness follows from the strong convexity of the objective in (2). □

We now provide a criterion for choosing the value of $\tau$.

Theorem 3.2. Let n = ^ = , and r 3 = SlMkj . 

Assume

$$
\tau \ge \max(\tau_1, \tau_2, \tau_3, \|M\|_F). \qquad (7)
$$

Then, under the other assumptions of Theorem 1.1, $(L_0, S_0)$ is the unique
solution to the strongly convex programming (2) with high probability.
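Proof. In order to check the conditions in Theorem 3.1, we will prove that
there exists a matrix $W$ obeying

$$
\begin{aligned}
& \mathcal{P}_T W = 0 \\
& \|W\| \le \beta \\
& \mathcal{P}_{Q^\perp} W = -\mathcal{P}_{Q^\perp}\left(UV^* + \tfrac{1}{\tau}L_0\right) \\
& \left\|\mathcal{P}_{\Omega^\perp}\left(UV^* + W + \tfrac{1}{\tau}L_0 - \tfrac{1}{\tau}S_0\right)\right\|_\infty \le \beta\lambda \\
& \left\|\mathcal{P}_\Omega\left(UV^* + W - \lambda\operatorname{sgn}(S_0) + \tfrac{1}{\tau}L_0 - \tfrac{1}{\tau}S_0\right)\right\|_F \le \alpha\lambda
\end{aligned}
\qquad (8)
$$

Note that $W = W^L + W^S + W^Q$, where $W^L$, $W^S$ and $W^Q$ have the
analytical forms constructed in the paper [15]. We will check one by one that
the above conditions hold true. For simplicity of proof, we denote

$$
\gamma := \|\mathcal{P}_{\Omega^\perp}(L_0 - S_0)\|_\infty, \qquad \delta := \|\mathcal{P}_\Omega(L_0 - S_0)\|_F.
$$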

Without loss of generality, let $\beta > 1/2$. With the help of the
construction [15] of $W^L$, $W^S$ and $W^Q$, it is easy to check that the first
and second conditions hold true. With respect to the third condition, according
to the paper [15], we have $\mathcal{P}_{Q^\perp} W^L = 0$ and
$\mathcal{P}_{Q^\perp} W^S = 0$. According to the modification of $W^Q$
constructed in Lemma 2.4, we have
$\mathcal{P}_{Q^\perp} W^Q = -\mathcal{P}_{Q^\perp}(UV^* + \frac{1}{\tau}L_0)$.
It is easy to check that
$\mathcal{P}_{Q^\perp} W = \mathcal{P}_{Q^\perp} W^L + \mathcal{P}_{Q^\perp} W^S + \mathcal{P}_{Q^\perp} W^Q = -\mathcal{P}_{Q^\perp}(UV^* + \frac{1}{\tau}L_0)$,
which implies that the third condition holds true. Next, we show that the last
two conditions also hold true under suitable assumptions. Pertaining to the
fourth inequality, we have

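$$
\begin{aligned}
\left\|\mathcal{P}_{\Omega^\perp}\left(UV^* + W + \tfrac{1}{\tau}L_0 - \tfrac{1}{\tau}S_0\right)\right\|_\infty
&\le \|\mathcal{P}_{\Omega^\perp}(UV^* + W^L)\|_\infty + \|\mathcal{P}_{\Omega^\perp} W^S\|_\infty \\
&\quad + \|\mathcal{P}_{\Omega^\perp} W^Q\|_\infty + \tfrac{1}{\tau}\|\mathcal{P}_{\Omega^\perp}(L_0 - S_0)\|_\infty \\
&\le \frac{\lambda}{4} + \frac{\lambda}{8} + \frac{\lambda}{8} + \frac{1}{\tau}\|\mathcal{P}_{\Omega^\perp}(L_0 - S_0)\|_\infty \\
&= \frac{\lambda}{2} + \frac{\gamma}{\tau}.
\end{aligned}
$$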

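For the last inequality, noting that
$\mathcal{P}_\Omega(W^S) = \lambda\operatorname{sgn}(S_0)$ and
$\mathcal{P}_\Omega(W^Q) = 0$ as shown in [15], we can obtain

$$
\begin{aligned}
\left\|\mathcal{P}_\Omega\left(UV^* + W - \lambda\operatorname{sgn}(S_0) + \tfrac{1}{\tau}L_0 - \tfrac{1}{\tau}S_0\right)\right\|_F
&= \left\|\mathcal{P}_\Omega\left(UV^* + W^L + \tfrac{1}{\tau}L_0 - \tfrac{1}{\tau}S_0\right)\right\|_F \\
&\le \|\mathcal{P}_\Omega(UV^* + W^L)\|_F + \tfrac{1}{\tau}\|\mathcal{P}_\Omega(L_0 - S_0)\|_F \\
&\le \frac{\lambda}{4} + \frac{\delta}{\tau}.
\end{aligned}
$$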



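In order to satisfy the condition (8), we choose $\tau$ obeying

$$
\frac{\lambda}{2} + \frac{\gamma}{\tau} \le \beta\lambda, \quad \text{and} \quad \frac{\lambda}{4} + \frac{\delta}{\tau} \le \alpha\lambda. \qquad (9)
$$

Therefore

$$
\tau \ge \max\left(\frac{\gamma}{(\beta - 1/2)\lambda}, \frac{\delta}{(\alpha - 1/4)\lambda}\right). \qquad (10)
$$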
Therefore



Combining (9) with (6), we can obtain

$$
\frac{\lambda}{2} + \frac{\gamma}{\tau} + \frac{\lambda}{4} + \frac{\delta}{\tau} \le \beta\lambda + \alpha\lambda \le \lambda.
$$

Together with (10) and (11), Theorem 3.2 is established. □
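The two requirements in (9) can be turned into a small numerical check. The sketch below assumes that (9) reads $\lambda/2 + \gamma/\tau \le \beta\lambda$ and $\lambda/4 + \delta/\tau \le \alpha\lambda$, so that the smallest admissible $\tau$ is $\max\{\gamma/((\beta - 1/2)\lambda),\ \delta/((\alpha - 1/4)\lambda)\}$. The function name and the numerical values of $\gamma$, $\delta$ and $\lambda$ are hypothetical.

```python
def tau_from_conditions(gamma, delta, lam, alpha=3 / 8, beta=5 / 8):
    """Smallest tau with lam/2 + gamma/tau <= beta*lam and lam/4 + delta/tau <= alpha*lam."""
    if not (beta > 0.5 and alpha > 0.25 and alpha + beta <= 1.0):
        raise ValueError("need beta > 1/2, alpha > 1/4 and alpha + beta <= 1")
    tau_1 = gamma / ((beta - 0.5) * lam)     # from the first inequality
    tau_2 = delta / ((alpha - 0.25) * lam)   # from the second inequality
    return max(tau_1, tau_2)

gamma, delta, lam = 0.2, 1.5, 0.1   # illustrative values only
tau = tau_from_conditions(gamma, delta, lam)
```

With $\alpha = 3/8$ and $\beta = 5/8$, both denominators equal $\lambda/8$, so the bound reduces to $8\max(\gamma, \delta)/\lambda$.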

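In order to simplify the formula (7), we suppose $\alpha = 3/8$ and
$\beta = 5/8$, which satisfy the conditions above. Therefore

$$
\tau \ge \max\left(\frac{8\|\mathcal{P}_{\Omega^\perp}(L_0 - S_0)\|_\infty}{\lambda}, \frac{8\|\mathcal{P}_\Omega(L_0 - S_0)\|_F}{\lambda}\right). \qquad (12)
$$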

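However, note that the exact lower bound is very hard to obtain, because we
only have information about the given data matrix $M$. Note that

$$
\|\mathcal{P}_{\Omega^\perp} M\|_\infty \le \|M\|_\infty,
$$

and according to the paper [14], we have

$$
\|\mathcal{P}_\Omega(L_0 - S_0)\|_F \le \frac{\sqrt{15}}{3}\|M\|_F.
$$

Therefore, we can choose

$$
\tau \ge \max\left(\frac{8\|M\|_\infty}{\lambda}, \frac{8\sqrt{15}\|M\|_F}{3\lambda}\right).
$$

It is obvious that $\|M\|_\infty \le \|M\|_F$. Therefore, we can obtain the
result as follows.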

Theorem 3.3. Assuming

$$
\tau \ge \frac{8\sqrt{15}\|M\|_F}{3\lambda}
$$

and the other assumptions of Theorem 1.1, $(L_0, S_0)$ is the unique solution
to the strongly convex programming (2) with high probability.
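Since this bound depends only on the observed matrix $M$ and on $\lambda$, it is straightforward to evaluate. The sketch below assumes the bound reads $\tau \ge 8\sqrt{15}\,\|M\|_F/(3\lambda)$; the function name and the example values of $M$ and $\lambda$ are hypothetical and serve only to show the arithmetic.

```python
import math
import numpy as np

def tau_lower_bound(M, lam):
    """Evaluate 8 * sqrt(15) * ||M||_F / (3 * lam) for a data matrix M and weight lam."""
    return 8.0 * math.sqrt(15.0) * np.linalg.norm(M, "fro") / (3.0 * lam)

M_example = np.eye(2)      # ||M||_F = sqrt(2)
lam_example = 0.5          # illustrative value of lambda, with lambda < 1
tau_min = tau_lower_bound(M_example, lam_example)
```

Because $\lambda < 1$, the returned value always exceeds $\|M\|_F$, so a $\tau$ chosen this way also meets the requirement $\tau \ge \|M\|_F$ used in Lemma 2.4.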



4. Conclusion. In this paper, we have studied strongly convex programming for
principal component pursuit with reduced linear measurements. We first provide
sufficient conditions under which the strongly convex models lead to exact
recovery of the low-rank and sparse components; second, we give a criterion
for choosing the value of $\tau$, which offers useful guidance on how to set
suitable parameters when designing efficient algorithms. In particular, it is
easy to see that the main result of paper [14] is a special case of ours. In
this sense, we extend the result on choosing suitable parameters to the more
general problem.